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Abstract. A real-time capable core turbulence tokamak transport model is 
developed. This model is constructed from the regularized nonlinear regression of 
quasilinear gyrokinetic transport code output. The regression is performed with a 
multilayer perceptron neural network. The transport code input for the neural network 
training set consists of five dimensions, and is limited to adiabatic electrons. The 
neural network model successfully reproduces transport fluxes predicted by the original 
quasilinear model, while gaining five orders of magnitude in computation time. The 
model is implemented in a real-time capable tokamak simulator, and simulates a 300 s 
ITER discharge in 10 s. This proof-of-principle for regression based transport models 
anticipates a significant widening of input space dimensionality and physics realism 
for future training sets. This aims to provide unprecedented computational speed 
coupled with first-principle based physics for real-time control and integrated modelling 
applications. 

Introduction- Particle, momentum, and heat transport in the tokamak core is 
dominated by turbulence driven by plasma microinstabilities 0 E], An accurate 
predictive model for turbulent transport fluxes is thus vital for the interpretation and 
optimization of present-day experiments, and extrapolation to and control of future 
machines. 

Direct numerical simulation with massively parallel nonlinear gyrokinetic codes 
has provided tremendous insight to the underlying transport physics and success in 
reproducing experimental fluxes in many regimes. However, the computational cost - 
typically 10® CPU hours for a local flux calculation at a single radial point - precludes the 
routine use of such codes for integrated tokamak transport simulations which demand 
~ 10^ flux computations per 1 s of plasma evolution on JET scale devices. 
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Reduced turbulent transport models have been constructed to increase tractability. 
They are based on the quasilinear approximation, which is proven to be largely valid 
in the core of tokamak plasmas [31 SI E]. These rely on nonlinear simulations for 
validating their ansatzes and normalizing factors. These models have proven successful 
in reproducing experimental prohles in many cases. Examples are TGLF [6] and 
QuaLiKiz [7]. A ~6 orders of magnitude speedup is gained in quasilinear calculations 
compared to nonlinear simulations. However, while extremely useful, the tractability 
of such models is still marginal for convenient large-scale scenario development over 
discharge timescales. For example, 1 s of JET plasma evolution can take up to 10 hours 
with 10 processors, depending on the integrated modelling platform used. This speed 
is also insufficient for applications such as trajectory optimization, and simulations for 
developing real-time controllers. Furthermore, any increase in physics fidelity in the 
models often results in a trade off with further decrease in tractability. 

This Letter illustrates an approach to overcome these challenges. The central point 
is to relegate the expensive flux calculations to a stage precedent to its use in a transport 
simulation. Instead, analytical formulae are to be used in the simulation, based on 
a neural network (NN) nonlinear regression of quasilinear fluxes previously compiled 
in a database. The advantage is twofold: 1) the numerical resolution of analytical 
formulae is orders of magnitude faster than original flux calculation; 2) the computation 
time required for compiling the database is independent from the computation time 
spent during the tokamak simulation itself, hence the training set for NN regression 
can include results from more complete codes than used in contemporary integrated 
transport modelling. 

Neural networks have found multiple applications in tokamak research, including: 
nonlinear regression for energy conhnement scaling |8]; neoclassical transport [9]; rapid 
determination of equilibria jlO] , electron temperature prohles and charge exchange 
spectra ca; classihcation of disruption [131 El E] and L-H transition onsets [I6] . Most 
related to this work is a regression of Dlll-D heat huxes from experimental power balance 
databases na. 

Quasilinear transport model and training set- The QuaLiKiz quasilinear 
gyrokinetic transport model [313 EHl EHl |20] was employed in this work. QuaLiKiz solves 
a linear gyrokinetic dispersion relation for calculating wavenumber spectra of instability 
growth rates and frequencies. Then, integrating over the spectra, the transport huxes are 
calculated via quasilinear hux integrals and nonlinear saturation rules. The bulk of the 
computational time is spent in the hrst stage, the dispersion relation solver. QuaLiKiz 
has been coupled to the CRONOS |2T] integrated modelling suite, and has successfully 
reproduced temperature and density prohles of JET and Tore-Supra discharges [2^l23] . 
Following recent upgrades [23], the computational time for the QuaLiKiz eigenvalue 
solver at a single wavenumber in QuaLiKiz is on the order of ~1 s. 

A database of QuaLiKiz solutions was constructed, in the ion temperature gradient 
(ITG) instability regime. This instability is often the primary driver of tokamak 
microturbulence. The code was run with adiabatic electrons for simplicity, which 
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Table 1. Summary of input parameters for the QuaLiKiz adiabatic electron ITG 
database employed in this work 


Parameter Min value Max value No. of points 

R/Lxi 

2 

12 

30 

T/R 

0.3 

3 

20 

q 

1 

5 

20 

S 

0.1 

3 

20 

kePs 

0.05 

0.8 

16 

Total 

no. of points 

3 840 000 


also decreases the computational time to ~300 ms. The database covers four input 
parameters known to have significant impact on ITG transport fluxes in this regime: 
the driving normalized logarithmic ion temperature gradient R/Lti, the ion to electron 
temperature ratio Ti/Tf., the safety-factor q, and the magnetic shear s = In 

addition, the input normalized wavenumber keps'wa.s scanned, constricted to above 
ion-Larmor-radius scales, where ps = y/Tf>mi/(Ziq^B). The following parameters were 
maintained fixed: the normalized logarithmic density gradient R/L^ = 3, normalized 
radial location r/a = 0.5. No Shafranov shift was assumed in the geometry. The 
database consists of a dense grid of points summarized in table [H from which the training 
sets for the neural network were sifted. The QuaLiKiz outputs we investigate are: 
growth rates and frequencies, which correspond to 5D input space; ion heat flux, which 
corresponds to 4D input space due to integration over wavenumbers. The database 
includes cases corresponding to unstable modes, and cases where no instabilities were 
found by QuaLiKiz, and the outputs are set to 0. 

A regression of the ion heat flux has immediate application for transport modelling. 
However, a regression of the more primitive linear output has its own specific 
applications. For example, since the dispersion relation solver is the slowest part of 
the code, a fast reproduction of growth rates, frequencies and eigenfunctions would 
allow rapid tests of various saturation rule formulations throughout parameter space. 
These saturation rules typically evolve following continuous comparisons with nonlinear 
simulations and experiments. In this sense, a database consisting of the complete 
outputs of linear codes does not become obsolete, while a quasilinear flux database 
can. 

Neural networks - The goal is to find analytical formulae which robustly reproduce 
the various QuaLiKiz outputs. To this end, a multilayer perceptron neural network is 
used, which is a nonlinear function with tunable variables (weights and biases), with 
the property of universal approximation [25l [26]. For an overview, with an emphasis 
on applications for fusion, see Ref. |27j. Linear combinations of the inputs and biases 
are propagated through a series of nonlinear transfer function vectors (named ‘hidden 
layers’), until eventually linearly combined to an output layer. With two hidden layers 
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and a single ontpnt valne (as nsed in this work), this is represented as: 

y = h + Yl [h’l + E w]^9 [h] + j (1) 

Where y is the ontpnt ‘nenron’ containing the ontpnt valne (i.e. growth rate, freqnency, 
or ion heat flnx), Xk the vector of inpnt valnes, the bias vectors, w*"' the Mxl weight 
matrix connecting the inpnt vector to the 1st hidden layer, the NxM weight matrix 
connecting the two hidden layers, and the weight vector connecting the 2nd hidden 
layer to the ontpnt nenron. g is the nonlinear transfer fnnction, dehned as a sigmoid in 
this work: 

1 P) 

1 + e 

Following a series of optimization tests, two hidden layers, as shown in eqnation [H were 
employed here. The hidden layer sizes M and N were set to 40. The inpnt layer size, I, 
is 4 for ion heat flnxes, and 5 for growth rates and freqnencies. 

The key stage is the determination of the optimized valnes of the weights and 
biases. This is done by minimizing a cost fnnction consisting of the average sqnared error 
between the network ontpnt and known target ontpnt. This set of target ontpnt, known 
as the ‘training set’, is a snbset of the QnaLiKiz ontpnt valnes from the database. The 
BFGS algorithm [28], implementing a qnasi-Newton method, was nsed for the weight 
and bias optimization. All NN training in this work was carried ont with the MATLAB 
nenral network toolbox [29|. Following training, the network ontpnt then emnlates the 
original model within the database inpnt parameter envelope. This is validated by 
comparison to validation sets sifted from the database, which are different from the 
training set. 

To avoid overhtting the data, regnlarization techniqnes were nsed in the regression. 
This corresponds to adding a penalty term in the cost fnnction related to the snm of 
sqnares of the network weights and biases, leading to smoother ontpnt. The nse of 
regnlarization ensnres that the NN response is smooth (e.g. withont strong oscillations) 
in sparse regions of training set parameter space or when extrapolating beyond the 
training set envelope. 

The analytic form of the nonlinear regression fnnction allows for the calcnlation 
of analytical gradients of the ontpnts with respect to the inpnts. This is vital for the 
efficient solntion of fast implicit schemes in real-time capable core transport simnlators 
snch as RAPTOR [30]. The regnlarization also ensnres smooth gradients thronghont 
parameter space, important for the stability of snch implicit schemes. 

Regression results - We focns on the ion heat flnx NN regression, dne to its direct 
relevance for transport modelling applications. Snccessfnl regressions of the growth rates 
and freqnencies were also obtained bnt for brevity not discnssed here. 

To captnre the instability thresholds with high hdelity, the regression was only 
carried ont for a training set corresponding to nnstable modes. The NN ontpnt for 
the stable regions in the validation set was then negative, since the regnlarized network 
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tends to smoothly extrapolate the trends observed towards the training set envelope. 
For the final heat fiux output, these negative values were then set to zero to represent 
stability. This scheme avoids having the regularized regression network attempt to 
directly fit the discontinuous gradients at the instability thresholds, which would be 
performed poorly due to to the regularization constraint. This is an important point 
since tokamak transport often tends to be maintained near the critical temperature 
gradient thresholds, especially in high temperature regimes. 

The network was trained with a training set of 35000 points chosen randomly from 
the set of unstable modes in the database. A comparison between the regression NN and 
QuaLiKiz outputs for a validation set of 10000 unstable cases (different from the training 
set) is shown in figure [1] The regression network has an RMS error of 0.77 in gyroBohm 

7,3/2^172 

units {xiGB = (z g B)2r ) Compared to the validation set. This RMS error is similar 
for the training set itself and is primarily due to the regularization constraint. The 
impact of this error on the simulated profiles is minor. This is due to stiffness, defined 
here as the local gradient of Xi with respect to the driving R/LTi- To quantify this, a 
comparison was made between the R/Lti values predicted by the NN and QuaLiKiz to 
balance a representative XiGB = 1- This was done for all values of g, s, and Tj/Tg in 
the database. The RMS error was A (R/Lti) = 0.29, which corresponded to an average 
relative error of only 4.2%. 

The typical quality of the fits can be seen in figure [21 displaying scans of the 4 
separate input parameters while the others remained fixed. Negative outputs of the NN 
network are set to zero. Note the resulting excellent fit of the instability thresholds. 
In addition, extrapolating the NN scans beyond the range of the training set maintains 
the trend observed in the data, due to the regularization. This is very encouraging 
with regard to extension of this approach to more sparse datasets in higher dimensions. 
However, we do not intend to routinely use NN models in poorly represented regions of 
parameter space, as the quality of extrapolation cannot be determined a priori. Rather, 
the training sets should be continuously expanded to cover such encountered sparse or 
empty regions, and the NN then periodically retrained. Nevertheless, the smoothness 
of the regularized NN response when extrapolating ensures its robustness and stability 
during practical use as a transport model, including during phases when such sparse 
regions are encountered. 

Each NN output is calculated on a sub 10 fis timescale in MATLAB on a Intel(R) 
Xeon(R) E5450 CPU @ 3.00GHz. This is a 5 order of magnitude speedup in comparison 
to the original QuaLiKiz calculations. 

Application in transport codes.-A transport model based on the trained neural 
network was constructed, and implemented both in the CRONOS [21] and RAPTOR [30] 
integrated modelling codes. 

In CRONOS, the validity of the NN transport model was assured by a successful 
comparison with a JET baseline H-mode shot 73342 with ion and electron heat transport 
previously simulated [23] with the full QuaLiKiz model. For brevity we will not focus 
further on this case. Rather, we focus on the real-time simulation capabilities offered 
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Figure 1. Comparison between normalized ion heat fluxes obtained directly from 
QuaLiKiz (x-axis) and those from its NN regression (y-axis) 


by coupling the NN model to RAPTOR. 

Presently, RAPTOR only models electron heat transport. The NN model output 
was thus modihed to roughly approximate ITG regime electron heat transport with 
kinetic electrons. This was done by assuming that heat fluxes in ITG kinetic electron 
cases are higher by factor 3 compared with adiabatic electron cases, and furthermore 
assuming an ion to electron heat flux ratio of Qi/qe = 3. These approximations are based 
on typical nonlinear and quasilinear observations in the ITG regime [SB El- 

In figure 131 we compare a RAPTOR simulation of an ITER hybrid scenario, using 
the QuaLiKiz NN model for electron heat transport, with a simulation of the same 
case originally carried out [32] using GRONOS and the GLF23 [33] transport model. 
Using GLF23 allows to compare over ITER-scale discharge times of >100s, which is less 
tractable using the original QuaLiKiz model. For heat transport in a pure ITG regime, 
GLF23 and QuaLiKiz predictions are expected to be similar, as illustrated in specific 
single-time-slice comparisons [23] . 

The RAPTOR simulation uses all the same actuator (source) inputs and density 
evolution as the GRONOS simulation. Ion temperatures were held fixed at Tj/Te ~ 0.8 
in L-mode and Tj/Tg ~ 0.9 in H-mode. The NN model was operational within a 
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Figure 2. Comparison of NN parameter scans (blue solid lines) vs the original 
QuaLiKiz ion heat flux calculations (red dots). The scans are in R/Lxi (top left 
panel), Ti/Te (top right panel), q (bottom left panel) and s (bottom right panel) 


normalized toroidal flux coordinate (p) range of 0.25 to 0.95. For p > 0.95, Xe was 
feedback controlled to maintain a prescribed edge pedestal temperature of 4 keV. For 
p < 0.25, a constant Xe was assumed to maintain a reasonable level of transport, 
since GLF23 and QuaLiKiz both predicted stability within that region. A RAPTOR 
simulation of an entire 300 s ITER discharge took 10 s on a single CPU, corresponding 
to 30x faster than real-time. This combination of simulation speed and first-principle 
modelling is unprecedented. With CRONOS/GLF23, the same simulation took 24 
hours. 

Conclusions and outlook- A neural network fit to a restricted subspace of 
quasilinear gyrokinetic transport model calculations, relevant in the ITG regime, was 
carried out and applied as a transport model for integrated modelling. While the 
quasilinear model, QuaLiKiz, is 6 orders of magnitude faster than nonlinear simulations, 
the NN regression leads to a further 5 order of magnitude speedup. This model is thus 
real-time capable while still being based on first-principles, which is unprecedented. 
This model has been coupled to the CRONOS integrating modelling suite, and validated 
against a full QuaLiKiz simulation in the ITG regime. The model is also coupled to 
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Figure 3. Comparison between the predictions for an ITER hybrid discharge 
carried out with CRONOS/GLF23 [32] (red curve) and a RAPTOR simulation using 
the QuaLiKiz NN transport model (blue curve). A typical H-mode profile (left panel) 
and time dependence at mid-radius (right panel) are shown. The LH transition was 
set at 100 s. 

the real-time capable RAPTOR tokamak simulation code, and can model a 300s ITER 
discharge within 10s, with good agreement with previous modelling using CRONOS and 
the GLF23 transport model. 

This opens up many new possibilities for real-time controller design and validation, 
scenario preparation and optimization, and real-time discharge supervision. Such models 
can be used to design controllers for the plasma prohles using model-based controller 
design methods (e.g. [M] or [35]). The transport model can be used in closed-loop 
simulations to validate the designed controllers. Recent work on plasma ramp-up 
trajectory optimization [36] was carried out with an ad-hoc transport model, and can 
now be improved using this hrst-principle-based transport model. Also, this transport 
model can be used in real-time simulations to verify the measured plasma evolution and 
warn a supervisory control system of any unexpected deviations during the discharge 
|37j . Specihcally for ITER, the faster-than-real-time opens up the possibility of (on-line) 
real-time optimization of the discharge evolution in response to such unexpected events. 

While applications in the ITG transport regime are already feasible with this model, 
there remains much scope for expanding the number of input dimensions in the databases 
used for the hts, as well as employing slower yet more complete linear gyrokinetic 
codes for populating the database. Neural network topology complexity favourably 
scales linearly with input dimensionality. However, we estimate that uniform density 
population of the input dimensions, as carried out in this work, is feasible up to N~10. 
This is due to constraints on the NN training and quasilinear database calculation times. 
For higher dimensionalities, a training set which captures the natural correlations of 
the input parameters is then vital. This can be done by basing the training sets on 
experimental parameters and reasonable extrapolations thereof. This is a feasible goal, 
as also evidenced in Ref. ini. and this work is ongoing. 
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